60 research outputs found

    Motion Tracking and Potentially Dangerous Situations Recognition in Complex Environment

    In recent years, video surveillance systems have played an important role in human safety and security by monitoring public and private areas. In this chapter, we discuss the development of an intelligent surveillance system to detect, track and identify potentially hazardous events that may occur at level crossings (LC). The system starts by detecting and tracking objects on the level crossing. Then, a danger evaluation method is built using a hidden Markov model in order to predict the trajectories of the detected objects. The trajectories are analyzed with a credibility model to evaluate dangerous situations at level crossings. Synthetic and real data are used to test the effectiveness and robustness of the proposed algorithms, and of the whole approach, by considering various scenarios in several situations.
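
As a rough illustration of how a hidden Markov model can be used for prediction, the sketch below runs a normalised forward pass over a discrete HMM and then propagates the resulting belief one step ahead. It is not the chapter's danger evaluation method: the state names, observation symbols and all probabilities are hypothetical, chosen only to make the example runnable.

```python
from typing import Dict, List

Probs = Dict[str, float]


def _normalise(weights: Probs) -> Probs:
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("observation sequence has zero probability under the model")
    return {state: w / total for state, w in weights.items()}


def forward_filter(
    observations: List[str],
    states: List[str],
    start_p: Probs,
    trans_p: Dict[str, Probs],
    emit_p: Dict[str, Probs],
) -> Probs:
    """Return P(state at the last step | all observations) via the forward algorithm."""
    # Initial belief: prior times likelihood of the first observation.
    belief = _normalise({s: start_p[s] * emit_p[s][observations[0]] for s in states})
    for obs in observations[1:]:
        # Predict with the transition model, then correct with the new observation.
        predicted = {s: sum(belief[p] * trans_p[p][s] for p in states) for s in states}
        belief = _normalise({s: predicted[s] * emit_p[s][obs] for s in states})
    return belief


def predict_next(belief: Probs, states: List[str], trans_p: Dict[str, Probs]) -> Probs:
    """Propagate the current belief one time step ahead (no new observation)."""
    return {s: sum(belief[p] * trans_p[p][s] for p in states) for s in states}


# Hypothetical two-state model: an object is either approaching or on the crossing.
STATES = ["approaching", "on_crossing"]
START = {"approaching": 0.5, "on_crossing": 0.5}
TRANS = {
    "approaching": {"approaching": 0.5, "on_crossing": 0.5},
    "on_crossing": {"approaching": 0.0, "on_crossing": 1.0},
}
EMIT = {
    "approaching": {"far": 1.0, "near": 0.0},
    "on_crossing": {"far": 0.0, "near": 1.0},
}

current = forward_filter(["far", "near"], STATES, START, TRANS, EMIT)
upcoming = predict_next(current, STATES, TRANS)
```

In a real system the states would typically describe position or motion regions in the scene, and the predicted distribution would feed a separate danger assessment step.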

    Learning to recognise 3D human action from a new skeleton-based representation using deep convolutional neural networks

    Recognising human actions in untrimmed videos is an important and challenging task. An effective three-dimensional (3D) motion representation and a powerful learning model are two key factors influencing recognition performance. In this study, the authors introduce a new skeleton-based representation for 3D action recognition in videos. The key idea of the proposed representation is to transform the 3D joint coordinates of the human body carried in skeleton sequences into RGB images via a colour encoding process. By normalising the 3D joint coordinates and dividing each skeleton frame into five parts, where the joints are concatenated according to the order of their physical connections, the colour-coded representation is able to represent the spatio-temporal evolution of complex 3D motions, independently of the length of each sequence. They then design and train different deep convolutional neural networks, based on the residual network architecture, on the obtained image-based representations to learn 3D motion features and classify them into classes. The proposed method is evaluated on two widely used action recognition benchmarks: MSR Action3D and NTU-RGB+D, a very large-scale dataset for 3D human action recognition. The experimental results demonstrate that the proposed method outperforms previous state-of-the-art approaches while requiring less computation for training and prediction.
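
The colour encoding idea can be sketched as follows, under stated assumptions: each joint's (x, y, z) coordinates are min-max normalised over the whole sequence and mapped to the (R, G, B) channels of one pixel, with joints as rows and frames as columns. The function name, the global normalisation and the image layout are illustrative choices, not details taken from the paper. The five-part joint ordering is assumed to have been applied to the input already, and the final resize to a fixed image size is omitted.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) of one body joint
Frame = List[Joint]                 # all joints of one skeleton frame
Pixel = Tuple[int, int, int]        # (R, G, B), each in 0..255


def skeleton_to_rgb(sequence: List[Frame]) -> List[List[Pixel]]:
    """Encode a skeleton sequence as an image: rows are joints, columns are frames."""
    if not sequence or not sequence[0]:
        raise ValueError("sequence must contain at least one frame with one joint")

    # Assumption: a single global min/max over the sequence drives the normalisation.
    values = [c for frame in sequence for joint in frame for c in joint]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant sequence

    def to_byte(value: float) -> int:
        # Map a coordinate from [lo, hi] to an integer intensity in [0, 255].
        return int(round(255 * (value - lo) / span))

    n_joints = len(sequence[0])
    image: List[List[Pixel]] = []
    for j in range(n_joints):
        # One row per joint; x, y, z become the R, G, B channels of each pixel.
        row = [tuple(to_byte(c) for c in frame[j]) for frame in sequence]
        image.append(row)
    return image


# Three frames, two joints each (toy coordinates already in [0, 1]).
toy_sequence = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(1.0, 0.0, 1.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
]
toy_image = skeleton_to_rgb(toy_sequence)
```

Because the image width equals the number of frames, sequences of different lengths yield images of different widths; resizing them to a common size is what would make the input to a CNN independent of sequence length.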

    Skeletal Movement to Color Map: A Novel Representation for 3D Action Recognition with Inception Residual Networks

    This paper was presented at the 25th IEEE International Conference on Image Processing (ICIP). We propose a novel skeleton-based representation for 3D action recognition in videos using Deep Convolutional Neural Networks (D-CNNs). Two key issues are addressed: first, how to construct a robust representation that easily captures the spatio-temporal evolution of motions from skeleton sequences; second, how to design D-CNNs capable of learning discriminative features from the new representation in an effective manner. To address these tasks, a skeleton-based representation named SPMF (Skeleton Pose-Motion Feature) is proposed. SPMFs are built from two of the most important properties of a human action: postures and their motions. They are therefore able to represent complex actions effectively. For learning and recognition, we design and optimize new D-CNNs based on the idea of Inception Residual networks to predict actions from SPMFs. Our method is evaluated on two challenging datasets, MSR Action3D and NTU-RGB+D. Experimental results indicate that the proposed method surpasses state-of-the-art methods whilst requiring less computation.

    Video-based human action recognition using deep learning: a review

    Human action recognition is an important application domain in computer vision. Its primary aim is to accurately describe human actions and their interactions from a previously unseen data sequence acquired by sensors. The ability to recognize, understand and predict complex human actions enables the construction of many important applications such as intelligent surveillance systems, human-computer interfaces, health care, security and military applications. In recent years, deep learning has received particular attention from the computer vision community. This paper presents an overview of the current state of the art in action recognition using video analysis with deep learning techniques. We present the most important deep learning models for recognizing human actions and analyze them to summarize the current progress of deep learning algorithms applied to human action recognition in realistic videos, highlighting their advantages and disadvantages. Based on a quantitative analysis using recognition accuracies reported in the literature, our study identifies state-of-the-art deep architectures in action recognition and then outlines current trends and open problems for future work in this field. This work was supported by the Centre d'Etudes et d'Expertise sur les Risques, l'environnement, la mobilité et l'aménagement (CEREMA) and the UC3M Conex-Marie Curie Program.

    Learning and Recognizing Human Action from Skeleton Movement with Deep Residual Neural Networks

    This paper was presented at the 8th International Conference of Pattern Recognition Systems (ICPRS 2017). Automatic human action recognition is indispensable for almost all artificial intelligence systems, such as video surveillance, human-computer interfaces and video retrieval. Despite much progress, recognizing actions in an unknown video is still a challenging task in computer vision. Recently, deep learning algorithms have proved their great potential in many vision-related recognition tasks. In this paper, we propose the use of Deep Residual Neural Networks (ResNets) to learn and recognize human actions from skeleton data provided by a Kinect sensor. First, the body joint coordinates are transformed into 3D arrays and saved in RGB image space. Five different deep learning models based on ResNet are designed to extract image features and classify them into classes. Experiments are conducted on two public video datasets for human action recognition containing various challenges. The results show that our method achieves state-of-the-art performance compared with existing approaches. This work was supported by the Cerema Research Center and Universidad Carlos III de Madrid. Sergio A. Velastin has received funding from the European Union's Seventh Framework Programme for Research, Technological Development and Demonstration under grant agreement No 600371, the Ministerio de Economía, Industria y Competitividad (COFUND2013-51509), the Ministerio de Educación, Cultura y Deporte (CEI-15-17) and Banco Santander.

    Exploiting deep residual networks for human action recognition from skeletal data

    The computer vision community is currently focusing on solving action recognition problems in real videos, which contain thousands of samples with many challenges. In this process, Deep Convolutional Neural Networks (D-CNNs) have played a significant role in advancing the state of the art in various vision-based action recognition systems. Recently, the introduction of residual connections in conjunction with a more traditional CNN model in a single architecture called Residual Network (ResNet) has shown impressive performance and great potential for image recognition tasks. In this paper, we investigate and apply deep ResNets for human action recognition using skeletal data provided by depth sensors. First, the 3D coordinates of the human body joints carried in skeleton sequences are transformed into image-based representations and stored as RGB images. These color images are able to capture the spatio-temporal evolution of 3D motions from skeleton sequences and can be efficiently learned by D-CNNs. We then propose a novel deep learning architecture based on ResNets to learn features from the obtained color-based representations and classify them into action classes. The proposed method is evaluated on three challenging benchmark datasets: MSR Action 3D, KARD and NTU-RGB+D. Experimental results demonstrate that our method achieves state-of-the-art performance on all these benchmarks whilst requiring fewer computational resources. In particular, the proposed method surpasses previous approaches by a significant margin of 3.4% on the MSR Action 3D dataset, 0.67% on the KARD dataset, and 2.5% on the NTU-RGB+D dataset.
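
For readers unfamiliar with residual connections, the following minimal sketch shows the core idea: a block outputs its input plus a learned transformation of that input, so the layers only need to model the residual. It operates on plain lists of floats and takes the transformation as an arbitrary function. The names `residual_block` and `transform` are illustrative, and this is not the architecture proposed in the paper.

```python
from typing import Callable, List

Vector = List[float]


def relu(values: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def residual_block(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Compute ReLU(x + F(x)), where F is the block's inner transformation."""
    fx = transform(x)
    if len(fx) != len(x):
        # The identity shortcut requires matching shapes.
        raise ValueError("transform must preserve the input length")
    return relu([a + b for a, b in zip(x, fx)])


# If the transformation outputs zeros, the block reduces to ReLU of its input.
passthrough = residual_block([1.0, -2.0], lambda v: [0.0 for _ in v])

# With F(x) = 2x, the block outputs ReLU(3x).
tripled = residual_block([1.0, 2.0], lambda v: [2.0 * a for a in v])
```

In a convolutional ResNet the transformation would be a small stack of convolution, normalisation and activation layers, and the addition is performed on feature maps rather than flat vectors.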

    Towards safer level crossings: existing recommendations, new applicable technologies and a proposed simulation model

    Every year, more than 400 people are killed in over 1,200 accidents at road-rail level crossings (LXs) in the European Union. Together with tunnels and specific road black spots, level crossings have been identified as a particular weak point in road infrastructure, seriously jeopardizing road safety. In the case of railway transport, level crossings can represent as much as 29% of all fatalities caused by railway operations. Up to now, the only effective solution appears to involve upgrading level crossing safety systems, even though in more than 90% of cases the primary accident cause is inadequate or improper human behavior rather than any technical, rail-based issue. This article provides results of research on possible technological solutions to reduce the number of accidents at level crossings and demonstrates their effectiveness. Elements of these recommendations and related research activities constitute the main focus of the research work described in this paper. It is organized as follows. In Section 2, we consider statistical data related to LX accidents in certain European countries. These statistics, as well as a European Commission Directive related to safety targets, are analyzed and the main trends are drawn. The study was carried out on the basis of the European Railway Agency's classification of LXs as active or passive. These results form the foundation for the work described in Section 3, which focuses on advanced technology to improve LX safety. The main thrust of the study is to evaluate low-cost, standard technology that can contribute to a direct decrease in the number of accidents at an affordable cost. Existing surveillance technologies already used in rail or road transport are considered first. To facilitate LX bimodality, special emphasis is put on technical solutions that have already demonstrated high efficiency in both environments. In Section 4, the mode of operation of each potential solution is modeled and evaluated under several operational scenarios, in order to evaluate the aggregate benefits of all the inputs. Building models that describe the dynamics surrounding the LX environment will provide a basis to support the decision-making process of a joint rail and road sector strategy on how to control LXs. Finally, Section 5 brings the study to a close with a list of the main areas on which to concentrate our future work.

    Learning to Recognize 3D Human Action from A New Skeleton-based Representation Using Deep Convolutional Neural Networks

    Get PDF
    Recognizing human actions in untrimmed videos is an important and challenging task. An effective 3D motion representation and a powerful learning model are two key factors influencing recognition performance. In this paper, we introduce a new skeleton-based representation for 3D action recognition in videos. The key idea of the proposed representation is to transform the 3D joint coordinates of the human body carried in skeleton sequences into RGB images via a color encoding process. By normalizing the 3D joint coordinates and dividing each skeleton frame into five parts, where the joints are concatenated according to the order of their physical connections, the color-coded representation is able to represent the spatio-temporal evolution of complex 3D motions, independently of the length of each sequence. We then design and train different Deep Convolutional Neural Networks (D-CNNs), based on the Residual Network architecture (ResNet), on the obtained image-based representations to learn 3D motion features and classify them into classes. Our method is evaluated on two widely used action recognition benchmarks: MSR Action3D and NTU-RGB+D, a very large-scale dataset for 3D human action recognition. The experimental results demonstrate that the proposed method outperforms previous state-of-the-art approaches whilst requiring less computation for training and prediction. This research was carried out at the Cerema Research Center (CEREMA) and Toulouse Institute of Computer Science Research (IRIT), Toulouse, France. Sergio A. Velastin is grateful for funding received from the Universidad Carlos III de Madrid, the European Union’s Seventh Framework Programme for Research, Technological Development and Demonstration under grant agreement No. 600371, the Ministerio de Economía, Industria y Competitividad (COFUND2013-51509), the Ministerio de Educación, Cultura y Deporte (CEI-15-17) and Banco Santander.